Abstract. We prove the free analogue of the transportation cost inequality for tracial distributions of non-commutative self-adjoint (also unitary) multi-variables, based on a random matrix approximation procedure.

Introduction 

The transportation cost inequality (TCI) gives an upper bound for the quadratic Wasserstein distance by the square root of the relative entropy. For probability measures on a Polish space $X$, the relative entropy is $S(\mu,\nu) = \int_X \log\frac{d\mu}{d\nu}\,d\mu$ if $\mu \ll \nu$ (otherwise, $S(\mu,\nu) = +\infty$), while the quadratic Wasserstein distance is defined as

$$W_2(\mu,\nu) := \inf_{\pi}\sqrt{\iint_{X\times X} d(x,y)^2\,d\pi(x,y)}, \tag{0.1}$$

where $d(x,y)$ is the metric on $X$ and $\pi$ runs over the probability measures on $X\times X$ with marginals $\mu$ and $\nu$. In 1996, M. Talagrand [20] obtained the celebrated TCI $W_2(\mu,\nu) \le \sqrt{2S(\mu,\nu)}$ for probability measures on $\mathbb{R}^n$, where $\nu$ is the standard Gaussian measure on $\mathbb{R}^n$. Since then, the TCI has received a lot of attention. It was shown by F. Otto and C. Villani [18] that, in the Riemannian manifold setting, the TCI follows from the logarithmic Sobolev inequality (LSI) of D. Bakry and M. Emery [1]. The LSI gives a lower bound for the relative Fisher information by the relative entropy, and it has played important roles in several contexts. Recent developments in both LSI and TCI are found in [16, 21], for example.
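As a sanity check of Talagrand's inequality in the simplest possible case, the following sketch uses the standard closed forms for one-dimensional Gaussians: for $\mu = N(m,s^2)$ and $\nu = N(0,1)$ one has $W_2(\mu,\nu)^2 = m^2 + (s-1)^2$ and $S(\mu,\nu) = \frac12(s^2 + m^2 - 1 - \log s^2)$. These closed forms and all function names are illustrative assumptions, not taken from the paper.

```python
import math

def w2_squared_gauss(m, s):
    """Squared quadratic Wasserstein distance between N(m, s^2) and N(0, 1)."""
    return m * m + (s - 1.0) ** 2

def rel_entropy_gauss(m, s):
    """Relative entropy S(N(m, s^2), N(0, 1)) for s > 0."""
    return 0.5 * (s * s + m * m - 1.0 - math.log(s * s))

def tci_gap(m, s):
    """2*S - W_2^2; Talagrand's inequality says this is non-negative."""
    return 2.0 * rel_entropy_gauss(m, s) - w2_squared_gauss(m, s)

if __name__ == "__main__":
    for m, s in [(0.0, 1.0), (1.5, 1.0), (0.0, 0.5), (-2.0, 3.0)]:
        print(f"m={m}, s={s}: 2S - W2^2 = {tci_gap(m, s):.6f}")
```

In this family the gap equals $2(s - 1 - \log s)$, so pure translations ($s = 1$) give equality; this suggests that the constant 2 cannot be improved.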

On the other hand, Ph. Biane and D. Voiculescu [3] proved the free analogue of Talagrand's TCI for compactly supported measures on $\mathbb{R}$, where the relative entropy is replaced by its free analogue and the Gaussian measure by the semicircular one. In [11, 12] we developed the random matrix approximation method to obtain a slight generalization of Biane and Voiculescu's free TCI as well as its counterpart on the circle $\mathbb{T}$. The free analogues of the LSI's on $\mathbb{R}$ and on $\mathbb{T}$ were also obtained in [5] and [11, 13] by the same method.

Recently, M. Ledoux used a similar random matrix technique to prove the free analogue of the Brunn-Minkowski inequality for measures on $\mathbb{R}$, from which (together with the Hamilton-Jacobi approach) he gave short proofs of the free TCI and LSI for measures on $\mathbb{R}$. Furthermore, his approach was shown in ^U] to be still applicable for getting the free TCI in [12] for measures on $\mathbb{T}$.

The free TCI's and LSI's so far are restricted to measures on $\mathbb{R}$ or $\mathbb{T}$ and are not truly non-commutative. However, Voiculescu's free entropy for multi-variables was well developed in j2Hj (see also j2S] and [9, Chap. 6]) and the Wasserstein distance was also introduced in [3] for multi-variables in the C*-algebra setting; so we are in a good position to extend the free TCI to non-commutative multi-variables. This is what we are going to do here. In fact, we will show the truly non-commutative free TCI when the "reference distribution" is chosen to be that associated with freely independent (self-adjoint or unitary) random variables. It of course includes the above-mentioned free analogue of Talagrand's TCI. However, the present work is only a first step in this direction. For example, it would be interesting to seek a non-commutative generalization of Ledoux's approach mentioned above, which would probably bring new insight into free probability theory.

In this paper, after preliminaries on the Wasserstein distance in §1 following [3], we obtain in §2 the free TCI for non-commutative tracial distributions of self-adjoint multi-variables with respect to a certain free product distribution (see Theorem 2.2). In §3 we present a sharper TCI (see Theorem 3.1) by replacing the free entropy with another free entropy-like quantity (introduced in [7] from the viewpoint of statistical mechanics), although the admissible tracial distributions are rather restricted. Furthermore, the counterparts of these free TCI's in the unitary setting are sketched in §4 without detailed proofs.



1. Preliminaries 

1.1. Notations. When $\mathcal{A}$ is a unital C*-algebra, $\mathcal{A}^{sa}$ stands for the set of self-adjoint elements of $\mathcal{A}$, and we denote by $S(\mathcal{A})$ the state space of $\mathcal{A}$ and by $TS(\mathcal{A})$ the tracial state space of $\mathcal{A}$, i.e., the set of all $\varphi \in S(\mathcal{A})$ such that $\varphi(ab) = \varphi(ba)$, $a,b \in \mathcal{A}$. The universal free product C*-algebra of two copies of $\mathcal{A}$ is denoted by $\mathcal{A}\star\mathcal{A}$, and $\sigma_1$ and $\sigma_2$ stand for the canonical embedding maps of $\mathcal{A}$ into the left and right copies of $\mathcal{A}$ in $\mathcal{A}\star\mathcal{A}$, respectively. Moreover, the universal free product C*-algebra of $n$ copies of $\mathcal{A}$ is simply written as $\mathcal{A}^{\star n}$. A pair $(\mathcal{A},\tau)$ with $\tau \in TS(\mathcal{A})$ is called a tracial C*-probability space, and when $\mathcal{A}$ is a von Neumann algebra and $\tau$ is a faithful normal tracial state it is called a tracial W*-probability space.

The usual non-normalized trace on the $N\times N$ complex matrix algebra $M_N(\mathbb{C})$ is denoted by $\mathrm{Tr}_N$, and $\|A\|_{HS}$ is the Hilbert-Schmidt norm of $A \in M_N(\mathbb{C})$, i.e., $\|A\|_{HS} := \sqrt{\mathrm{Tr}_N(A^*A)}$, while $\|A\|$ is the operator norm of $A$. We denote by $M_N^{sa}$ the set of all self-adjoint $A \in M_N(\mathbb{C})$ and by $\Lambda_N$ the Lebesgue measure on $M_N^{sa}$ with the obvious Euclidean structure. As usual, $\mathrm{U}(N)$ and $\mathrm{SU}(N)$ are the unitary and special unitary groups of order $N$. We denote by $\gamma_N^{\mathrm{U}}$ and $\gamma_N^{\mathrm{SU}}$ the Haar probability measures on $\mathrm{U}(N)$ and $\mathrm{SU}(N)$, respectively. We also denote by $\mathcal{P}(X)$ the set of all Borel probability measures on a Polish space $X$.

1.2. Non-commutative distributions. Departing slightly from the usual convention, we employ the scheme in [7] to deal with "non-commutative distributions." Let us fix $n \in \mathbb{N}$ and $R > 0$. The underlying C*-algebra we adopt is the universal free product C*-algebra $\mathcal{A}_R^{(n)} := C([-R,R])^{\star n}$ with norm $\|\cdot\|_R$ and a canonical set of self-adjoint generators $X_i(t) = t$ in the $i$th copy of $C([-R,R])$, $1 \le i \le n$. Each $\varphi \in S(\mathcal{A}_R^{(n)})$ provides a distribution or law of $X_1,\dots,X_n$ whose (non-commutative) moments are given by the $\varphi(X_{i_1}\cdots X_{i_m})$'s. Any distribution in the C*-algebra setting can indeed be realized in this way. More precisely, if $a_1,\dots,a_n$ are self-adjoint variables in a C*-probability space $(\mathcal{A},\varphi)$ with operator norm $\|a_i\| \le R$, then one has a (unique) *-homomorphism $\Psi$ from $\mathcal{A}_R^{(n)}$ into $\mathcal{A}$ sending each $X_i$ to $a_i$ so that the distribution of $X_1,\dots,X_n$ under $\varphi\circ\Psi \in S(\mathcal{A}_R^{(n)})$ coincides with that of $a_1,\dots,a_n$ under $\varphi$. Our main objects in the paper are the Wasserstein distance and the free entropy, which have been well developed only in terms of tracial states. Thus, in what follows, we will restrict our consideration to tracial distributions, i.e., elements in $TS(\mathcal{A}_R^{(n)})$.
The (microstates) free entropy $\chi$ introduced by Voiculescu j2H| is defined in our context for every $\tau \in TS(\mathcal{A}_R^{(n)})$ as follows: Let $\pi_\tau$ be the GNS representation of $\mathcal{A}_R^{(n)}$ associated with $\tau$; then we have the tracial W*-probability space $(\pi_\tau(\mathcal{A}_R^{(n)})'',\tilde\tau)$ with the normal extension $\tilde\tau$ of $\tau$ together with the self-adjoint variables $\pi_\tau(X_1),\dots,\pi_\tau(X_n)$. Then, the free entropy of $\tau$ at our disposal is

$$\chi(\tau) := \chi(\pi_\tau(X_1),\dots,\pi_\tau(X_n)) = \chi_R(\pi_\tau(X_1),\dots,\pi_\tau(X_n))$$

(see also [9, 6.3.6] for the latter equality). By definition the free entropy $\chi(\tau)$ is determined only by the moments of $\tau$ in $(X_1,\dots,X_n)$ independently of a particular choice of $R > 0$. (This is also the case for the Wasserstein distance as will be seen in §§1.3.)

Here, let us introduce a certain class of non-commutative distributions coming from so-called matrix integrals, which will play an important role in the paper. For each $N \in \mathbb{N}$ and $A_1,\dots,A_n \in M_N^{sa}$ with $\|A_i\| \le R$ we have the "non-commutative functional calculus"

$$h \in \mathcal{A}_R^{(n)} \mapsto h(A_1,\dots,A_n) \in M_N(\mathbb{C}),$$

that is, the canonical *-homomorphism from $\mathcal{A}_R^{(n)}$ into $M_N(\mathbb{C})$ sending each $X_i$ to $A_i$. Let $r_R$ be the retraction of $\mathbb{R}$ onto $[-R,R]$, i.e.,

$$r_R(t) := \begin{cases} -R & \text{if } t < -R, \\ t & \text{if } -R \le t \le R, \\ R & \text{if } t > R. \end{cases}$$

The next lemma is quite easy to show from the obvious inequality $|r_R(\alpha) - r_R(\beta)| \le |\alpha - \beta|$ for $\alpha,\beta \in \mathbb{R}$; so we omit the proof.

Lemma 1.1. We have $\|r_R(A) - r_R(B)\|_{HS} \le \|A - B\|_{HS}$ for every $A,B \in M_N^{sa}$.
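The contraction property of Lemma 1.1 can be probed numerically. The sketch below applies $r_R$ to a Hermitian matrix by clipping its eigenvalues and compares Hilbert-Schmidt distances before and after; the helper names (`random_hermitian`, `retract`, `lipschitz_ratio`) are hypothetical and the random test matrices are an arbitrary choice made only for illustration.

```python
import numpy as np

def random_hermitian(n, rng, scale=1.0):
    """Random n x n Hermitian matrix (hypothetical helper, used only for testing)."""
    g = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return scale * (g + g.conj().T) / 2.0

def retract(a, radius):
    """Functional calculus r_R(A): clip the eigenvalues of A to [-R, R]."""
    w, v = np.linalg.eigh(a)
    # v * d scales the columns of v, i.e. computes V diag(d).
    return (v * np.clip(w, -radius, radius)) @ v.conj().T

def hs_norm(a):
    """Hilbert-Schmidt (Frobenius) norm sqrt(Tr(A* A))."""
    return float(np.linalg.norm(a, "fro"))

def lipschitz_ratio(a, b, radius):
    """||r_R(A) - r_R(B)||_HS / ||A - B||_HS, which Lemma 1.1 bounds by 1."""
    return hs_norm(retract(a, radius) - retract(b, radius)) / hs_norm(a - b)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = random_hermitian(5, rng, scale=2.0)
    b = random_hermitian(5, rng, scale=2.0)
    print("ratio:", lipschitz_ratio(a, b, radius=1.0))
```

Such an experiment cannot replace the proof, but a ratio above 1 would immediately expose an error in an implementation of $r_R$.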

Hence, a usual approximation argument shows that the function $(A_1,\dots,A_n) \mapsto h(r_R(A_1),\dots,r_R(A_n))$ is continuous on $(M_N^{sa})^n \cong \mathbb{R}^{N^2 n}$ with respect to the Euclidean structure for each fixed $h \in \mathcal{A}_R^{(n)}$. Thus, each probability measure $\lambda \in \mathcal{P}((M_N^{sa})^n)$ gives rise to the tracial distribution $\lambda_R \in TS(\mathcal{A}_R^{(n)})$ defined by

$$\lambda_R(h) := \int_{(M_N^{sa})^n}\frac{1}{N}\mathrm{Tr}_N\big(h(r_R(A_1),\dots,r_R(A_n))\big)\,d\lambda(A_1,\dots,A_n), \quad h \in \mathcal{A}_R^{(n)}.$$

We call this $\lambda_R$ the random matrix distribution associated with $\lambda$. When the measure $\lambda$ is supported in $(M_{N,R}^{sa})^n$, where $M_{N,R}^{sa} := \{A \in M_N^{sa} : \|A\| \le R\}$, the retraction $r_R$ is of course not needed in the above definition, so that $\lambda_R$ is simply defined by integrating over $(M_{N,R}^{sa})^n$.

1.3. Wasserstein distance. This part is from [3] with slight modifications. Let $(a_1,\dots,a_n)$ and $(b_1,\dots,b_n)$ be $n$-tuples of non-commutative variables in tracial C*-probability spaces $(\mathcal{A}_1,\tau_1)$ and $(\mathcal{A}_2,\tau_2)$, respectively. Here, it should be emphasized that the $a_i$'s as well as the $b_i$'s are not necessarily self-adjoint (not even normal). We write $(a_1,\dots,a_n) \sim (b_1,\dots,b_n)$ if the *-distributions (or *-moments) of $(a_1,\dots,a_n)$ and of $(b_1,\dots,b_n)$ are the same, i.e.,

$$\tau_1(a_{i_1}^{\varepsilon_1}\cdots a_{i_m}^{\varepsilon_m}) = \tau_2(b_{i_1}^{\varepsilon_1}\cdots b_{i_m}^{\varepsilon_m})$$

for all $m \in \mathbb{N}$, $i_1,\dots,i_m \in \{1,\dots,n\}$ and $\varepsilon_1,\dots,\varepsilon_m \in \{1,*\}$. For $1 \le p < \infty$, the $p$-Wasserstein distance introduced in [3] is defined by

$$W_p((a_1,\dots,a_n),(b_1,\dots,b_n)) := \inf\Big\{\Big(\sum_{i=1}^n \tau(|a_i'-b_i'|^p)\Big)^{1/p}\Big\}, \tag{1.1}$$

where the infimum is taken over all $2n$-tuples $(a_1',\dots,a_n',b_1',\dots,b_n')$ in some tracial C*-probability space $(\mathcal{A},\tau)$ such that $(a_1',\dots,a_n') \sim (a_1,\dots,a_n)$ and $(b_1',\dots,b_n') \sim (b_1,\dots,b_n)$. The definition itself says that the quantity depends only on the *-moments of $(a_1,\dots,a_n)$ and $(b_1,\dots,b_n)$.

Another definition of $W_p$ was also introduced in [3], which is a bit more tractable than the above. Let $\mathcal{A}$ be a unital C*-algebra with a specified $n$-tuple $(a_1,\dots,a_n)$ of generators. For a given pair $\tau_1,\tau_2 \in TS(\mathcal{A})$ we define the set of (non-commutative tracial) joining states between $\tau_1$ and $\tau_2$ by

$$TS(\mathcal{A}\star\mathcal{A}\,|\,\tau_1,\tau_2) := \{\tau \in TS(\mathcal{A}\star\mathcal{A}) : \tau\circ\sigma_1 = \tau_1,\ \tau\circ\sigma_2 = \tau_2\}.$$

For $1 \le p < \infty$, the $p$-Wasserstein distance between $\tau_1$ and $\tau_2$ is defined by

$$W_p(\tau_1,\tau_2) := \inf\Big\{\tau\Big(\sum_{i=1}^n |\sigma_1(a_i)-\sigma_2(a_i)|^p\Big)^{1/p} : \tau \in TS(\mathcal{A}\star\mathcal{A}\,|\,\tau_1,\tau_2)\Big\}. \tag{1.2}$$

As remarked in [3, §§1.2], the two definitions (1.1) and (1.2) give the same quantity in the following way.

Proposition 1.2. For every $\tau_1,\tau_2 \in TS(\mathcal{A})$, let $(a_1',\dots,a_n')$ and $(a_1'',\dots,a_n'')$ be $(a_1,\dots,a_n)$ in $(\mathcal{A},\tau_1)$ and in $(\mathcal{A},\tau_2)$, respectively. Then we have

$$W_p(\tau_1,\tau_2) = W_p((a_1',\dots,a_n'),(a_1'',\dots,a_n'')).$$

The proof is easily done by manipulating appropriate GNS representations, so we leave it to the reader. An important consequence of the proposition is that $W_p(\tau_1,\tau_2)$ in (1.2) is independent of a particular choice of $\mathcal{A}$ with a specified $n$-tuple $(a_1,\dots,a_n)$; namely, it is determined only by the *-moments of $\tau_1$ and $\tau_2$ in $(a_1,\dots,a_n)$.

Basic properties of $W_p$ are in order.

1° $W_p(\tau_1,\tau_2)$ is a metric on $TS(\mathcal{A})$ (see [3, Theorem 1.3]).

2° $W_p(\tau_1,\tau_2)$ is jointly lower semi-continuous in $(\tau_1,\tau_2) \in TS(\mathcal{A})\times TS(\mathcal{A})$ in the weak* topology (see [3, Proposition 1.4]).

3° $W_p(\tau_1,\tau_2)^p$ is jointly convex in $(\tau_1,\tau_2) \in TS(\mathcal{A})\times TS(\mathcal{A})$. (This is easy to prove though not included in [3].)

4° If $a_1,\dots,a_n$ are self-adjoint (or more generally normal) and mutually commuting, then $W_p(\tau_1,\tau_2)$ coincides with the usual $p$-Wasserstein distance $W_p(\mu_1,\mu_2)$ (see (0.1)), where $\mu_1,\mu_2 \in \mathcal{P}(\mathbb{R}^n)$ (or $\mathcal{P}(\mathbb{C}^n)$) are the spectral distribution measures of the $n$-tuple $(a_1,\dots,a_n)$ constructed via the GNS representations associated with $\tau_1,\tau_2$, respectively (see [3, Theorem 1.5]).

We will treat the (quadratic) 2-Wasserstein distance $W_2(\tau_1,\tau_2)$ for tracial distributions of self-adjoint random variables in §2, §3 and for those of unitary random variables in §4. In the self-adjoint case, we will take the universal $\mathcal{A}_R^{(n)}$ with the specified self-adjoint generators $X_1,\dots,X_n$. This is indeed universal in the sense that when $a_1,\dots,a_n$ are self-adjoint in any $\mathcal{A}$, any tracial distribution of $(a_1,\dots,a_n)$ can be realized via some $\tau \in TS(\mathcal{A}_R^{(n)})$ as long as $R \ge \|a_i\|$, $1 \le i \le n$ (see §§1.2).

In this subsection, we provide an inequality between the free and usual 2-Wasserstein distances for random matrix distributions introduced in §§1.2, which will be one of the keys in our later discussions. The inequality corresponds to that in [12, Lemmas 2.6 and 2.8]; however, the argument here is simpler than there because we do not (indeed cannot) treat the "eigenvalue distributions."
Lemma 1.3. For every pair $\lambda_1,\lambda_2 \in \mathcal{P}((M_N^{sa})^n)$ and every $R > 0$, let $\lambda_{1,R},\lambda_{2,R} \in TS(\mathcal{A}_R^{(n)})$ be the corresponding random matrix distributions. Then we have

$$W_2(\lambda_{1,R},\lambda_{2,R}) \le \frac{1}{\sqrt{N}}\,W_2(\lambda_1,\lambda_2),$$

where $W_2(\lambda_1,\lambda_2)$ is the usual 2-Wasserstein distance between $\lambda_1,\lambda_2$ defined by

$$W_2(\lambda_1,\lambda_2) := \inf_{\pi}\sqrt{\iint\sum_{i=1}^n \|A_i - B_i\|_{HS}^2\,d\pi(A_1,\dots,A_n,B_1,\dots,B_n)}$$

over the joining measures $\pi$ on $(M_N^{sa})^n\times(M_N^{sa})^n$ of $\lambda_1,\lambda_2$, i.e., measures whose marginals are $\lambda_1,\lambda_2$ (see also (0.1)).

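Proof. For each $n$-tuple $\mathbf{A} = (A_1,\dots,A_n) \in (M_N^{sa})^n$ one has the *-homomorphism

$$\Psi_R^{\mathbf{A}} : h \in \mathcal{A}_R^{(n)} \mapsto h(r_R(A_1),\dots,r_R(A_n)) \in M_N(\mathbb{C})$$

sending $X_i$ to $r_R(A_i)$ (see §§1.2), and moreover for each $\mathbf{A},\mathbf{B} \in (M_N^{sa})^n$ there is a unique *-homomorphism

$$\Psi_R^{\mathbf{A},\mathbf{B}} : \mathcal{A}_R^{(n)}\star\mathcal{A}_R^{(n)} \to M_N(\mathbb{C})$$

determined by

$$\Psi_R^{\mathbf{A},\mathbf{B}}\circ\sigma_1 = \Psi_R^{\mathbf{A}}, \qquad \Psi_R^{\mathbf{A},\mathbf{B}}\circ\sigma_2 = \Psi_R^{\mathbf{B}}.$$

As in §§1.2, the function $(\mathbf{A},\mathbf{B}) \mapsto \Psi_R^{\mathbf{A},\mathbf{B}}(h)$ is continuous with respect to the Hilbert-Schmidt norms for each fixed $h \in \mathcal{A}_R^{(n)}\star\mathcal{A}_R^{(n)}$; hence every joining measure $\pi$ of $\lambda_1,\lambda_2$ gives rise to the tracial distribution $\pi_R \in TS(\mathcal{A}_R^{(n)}\star\mathcal{A}_R^{(n)})$ defined by

$$\pi_R(h) := \iint\frac{1}{N}\mathrm{Tr}_N\big(\Psi_R^{\mathbf{A},\mathbf{B}}(h)\big)\,d\pi(\mathbf{A},\mathbf{B}), \quad h \in \mathcal{A}_R^{(n)}\star\mathcal{A}_R^{(n)},$$

which satisfies

$$\pi_R\circ\sigma_1 = \lambda_{1,R}, \qquad \pi_R\circ\sigma_2 = \lambda_{2,R}.$$

Therefore, we have $\pi_R \in TS(\mathcal{A}_R^{(n)}\star\mathcal{A}_R^{(n)}\,|\,\lambda_{1,R},\lambda_{2,R})$ so that

$$\begin{aligned}
W_2(\lambda_{1,R},\lambda_{2,R})^2 &\le \pi_R\Big(\sum_{i=1}^n(\sigma_1(X_i)-\sigma_2(X_i))^2\Big) \\
&= \iint\sum_{i=1}^n\frac{1}{N}\mathrm{Tr}_N\big(\Psi_R^{\mathbf{A},\mathbf{B}}((\sigma_1(X_i)-\sigma_2(X_i))^2)\big)\,d\pi(\mathbf{A},\mathbf{B}) \\
&= \iint\sum_{i=1}^n\frac{1}{N}\mathrm{Tr}_N\big((r_R(A_i)-r_R(B_i))^2\big)\,d\pi(\mathbf{A},\mathbf{B}) \\
&= \iint\sum_{i=1}^n\frac{1}{N}\|r_R(A_i)-r_R(B_i)\|_{HS}^2\,d\pi(\mathbf{A},\mathbf{B}) \\
&\le \frac{1}{N}\iint\sum_{i=1}^n\|A_i-B_i\|_{HS}^2\,d\pi(\mathbf{A},\mathbf{B}),
\end{aligned}$$

where the last inequality is due to Lemma 1.1. Hence, the desired inequality follows by taking the infimum of the last integral over the joining measures $\pi$ of $\lambda_1,\lambda_2$. □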
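For $n = 1$ and point masses $\lambda_1 = \delta_A$, $\lambda_2 = \delta_B$, the inequality of Lemma 1.3 can be checked numerically. By property 4° in §§1.3 the free distance then reduces to the classical $W_2$ between the eigenvalue distributions of $r_R(A)$ and $r_R(B)$, which for two uniform $N$-point measures on the line is attained by the sorted (monotone) coupling, while the classical $W_2(\delta_A,\delta_B)$ is simply $\|A-B\|_{HS}$. The sketch below assumes these standard reductions; all helper names are hypothetical.

```python
import numpy as np

def random_hermitian(n, rng, scale=1.0):
    """Random n x n Hermitian matrix (hypothetical helper, used only for testing)."""
    g = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return scale * (g + g.conj().T) / 2.0

def clipped_spectrum(a, radius):
    """Eigenvalues of r_R(A) in ascending order."""
    return np.clip(np.linalg.eigvalsh(a), -radius, radius)

def free_w2_point_masses(a, b, radius):
    """W_2 between the spectral distributions of r_R(A) and r_R(B) (n = 1),
    computed with the monotone coupling of the sorted eigenvalues."""
    x = clipped_spectrum(a, radius)
    y = clipped_spectrum(b, radius)
    return float(np.sqrt(np.mean((x - y) ** 2)))

def lemma_1_3_bound(a, b):
    """Right-hand side N^{-1/2} * ||A - B||_HS for the point masses at A and B."""
    return float(np.linalg.norm(a - b, "fro") / np.sqrt(a.shape[0]))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = random_hermitian(8, rng, scale=2.0)
    b = random_hermitian(8, rng, scale=2.0)
    print(free_w2_point_masses(a, b, radius=1.5), "<=", lemma_1_3_bound(a, b))
```

In this special case the bound is the Hoffman-Wielandt inequality combined with the 1-Lipschitz property of $r_R$; for $n \ge 2$ no eigenvalue reduction is available, which is why the proof above works with joining states instead.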
Finally, we remark that the 2-Wasserstein distance is sometimes defined with the cost function of the form $\frac12\times(\text{distance})^2$. In fact, in ^3^DE] we adopted the definition with the multiple constant $\frac12$, so that the bounds of the TCI's there and in the present paper differ by a factor of 2.

2. Free TCI for $\chi$

We will obtain the free TCI for non-commutative tracial distributions with respect to the distribution of freely independent random variables, including a natural free analogue of Talagrand's celebrated TCI [20] with respect to the standard Gaussian measure on $\mathbb{R}^n$.

Let $Q = (Q_1,\dots,Q_n)$ be an $n$-tuple of real-valued continuous functions on $\mathbb{R}$ with

$$\lim_{|x|\to\infty}\exp(-\varepsilon Q_i(x)) = 0 \quad\text{for every } \varepsilon > 0. \tag{2.1}$$

Then, for each $Q_i$ we define the $N\times N$ self-adjoint random matrix $\lambda_N(Q_i) \in \mathcal{P}(M_N^{sa})$ by

$$d\lambda_N(Q_i)(A) := \frac{1}{Z_N(Q_i)}\exp\big(-N\mathrm{Tr}_N(Q_i(A))\big)\,d\Lambda_N(A)$$

with a normalization constant $Z_N(Q_i)$, whose mean eigenvalue distribution on $\mathbb{R}$ is denoted by $\hat\lambda_N(Q_i)$. With $Q := Q_i$, a fundamental result in the theory of weighted potentials (see [19, I.1.3]) tells us that the functional

$$-\Sigma(\mu) + \mu(Q) := -\iint_{\mathbb{R}^2}\log|x-y|\,d\mu(x)\,d\mu(y) + \int_{\mathbb{R}}Q(x)\,d\mu(x), \quad \mu \in \mathcal{P}(\mathbb{R}),$$

has a unique minimizer $\mu_Q$ which is compactly supported and called the equilibrium measure associated with $Q$. For example, when $Q(x) = x^2/2$, $\mu_Q$ is the $(0,1)$-semicircular distribution $d\gamma_{0,2}(x) := \frac{1}{2\pi}\sqrt{4-x^2}\,dx$ supported on $[-2,2]$. Furthermore, the large deviation principle for self-adjoint random matrices (see [2], [9, 5.4.3]) shows that $\hat\lambda_N(Q)$ weakly converges to the equilibrium measure $\mu_Q$. Let $R_0 > 0$ be the smallest number such that all $\mu_{Q_i}$'s are supported in $[-R_0,R_0]$; for example, $R_0 = 2$ when $Q_i(x) = x^2/2$ for all $i$. We then notice that

$$\lambda_N(Q_i)(M_{N,R_0}^{sa}) = \hat\lambda_N(Q_i)([-R_0,R_0]) \longrightarrow \mu_{Q_i}([-R_0,R_0]) = 1 \tag{2.2}$$

as $N\to\infty$ for every $1 \le i \le n$. Let us consider the product measure

$$\lambda_N(Q) := \bigotimes_{i=1}^n\lambda_N(Q_i) \in \mathcal{P}((M_N^{sa})^n),$$

that is,

$$d\lambda_N(Q)(A_1,\dots,A_n) = \frac{1}{Z_N(Q)}\exp\Big(-N\sum_{i=1}^n\mathrm{Tr}_N(Q_i(A_i))\Big)\,d\Lambda_N^{\otimes n}(A_1,\dots,A_n)$$

with $Z_N(Q) := \prod_{i=1}^n Z_N(Q_i)$, and $\lambda_{N,R}(Q) \in TS(\mathcal{A}_R^{(n)})$ denotes the random matrix distribution associated with $\lambda_N(Q)$ (see §§1.2). Furthermore, when $R \ge R_0$, define the tracial distribution $\tau_Q \in TS(\mathcal{A}_R^{(n)})$ to be the free product state $\star_{i=1}^n\mu_{Q_i}$ on $\mathcal{A}_R^{(n)} = C([-R,R])^{\star n}$, where each $\mu_{Q_i}$ is regarded as a state on $C([-R,R])$ defined by integration. (Note that the moments of $\tau_Q$ are independent of the choice of $R \ge R_0$.)
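The Gaussian case $Q(x) = x^2/2$ mentioned above is easy to simulate: the density proportional to $\exp(-\frac N2\mathrm{Tr}_N(A^2))$ is the GUE with entries of variance $1/N$, and its empirical eigenvalue moments should approach those of the semicircular law $\gamma_{0,2}$ (Catalan numbers for even orders). The following is a minimal sketch under that assumption; the sampler and the moment helpers are hypothetical names, and the comparison is only approximate for finite $N$.

```python
import math
import numpy as np

def sample_gue(n, rng):
    """Sample from the density proportional to exp(-(n/2) Tr(A^2)) on n x n Hermitian matrices."""
    g = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    # Diagonal entries get variance 1/n; real and imaginary off-diagonal parts get 1/(2n).
    return (g + g.conj().T) / (2.0 * np.sqrt(n))

def semicircle_moment(k):
    """k-th moment of the (0,1)-semicircular law: Catalan number C_{k/2} for even k, else 0."""
    if k % 2:
        return 0.0
    m = k // 2
    return float(math.comb(2 * m, m) // (m + 1))

def empirical_moment(a, k):
    """(1/N) Tr(A^k), computed from the eigenvalues of the Hermitian matrix A."""
    return float(np.mean(np.linalg.eigvalsh(a) ** k))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = sample_gue(200, rng)
    for k in (2, 4, 6):
        print(k, empirical_moment(a, k), semicircle_moment(k))
    print("largest |eigenvalue|:", np.max(np.abs(np.linalg.eigvalsh(a))))
```

The largest eigenvalue in absolute value should be close to $R_0 = 2$, in line with the support of $\gamma_{0,2}$.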

We begin by restating the so-called asymptotic freeness due to Voiculescu [22] in our situation.

Lemma 2.1. Whenever $R \ge R_0$ we have

$$\lim_{N\to\infty}\lambda_{N,R}(Q) = \tau_Q \quad\text{weakly*}.$$
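Proof. Fix an arbitrary $R \ge R_0$. By (2.2) we get

$$\lambda_N(Q)\big((M_{N,R}^{sa})^n\big) = \prod_{i=1}^n\lambda_N(Q_i)(M_{N,R}^{sa}) \longrightarrow 1 \quad\text{as } N\to\infty.$$

For any non-commutative polynomial $p$ in $X_1,\dots,X_n$ ($\in \mathcal{A}_R^{(n)}$) we have

$$\begin{aligned}
\lambda_{N,R}(Q)(p) &= \int_{(M_N^{sa})^n}\frac1N\mathrm{Tr}_N\big(p(r_R(A_1),\dots,r_R(A_n))\big)\,d\lambda_N(Q)(A_1,\dots,A_n) \\
&= \int_{(M_{N,R}^{sa})^n}\frac1N\mathrm{Tr}_N\big(p(A_1,\dots,A_n)\big)\,d\lambda_N(Q)(A_1,\dots,A_n) \\
&\quad + \int_{(M_N^{sa})^n\setminus(M_{N,R}^{sa})^n}\frac1N\mathrm{Tr}_N\big(p(r_R(A_1),\dots,r_R(A_n))\big)\,d\lambda_N(Q)(A_1,\dots,A_n).
\end{aligned}$$

Since

$$\Big|\int_{(M_N^{sa})^n\setminus(M_{N,R}^{sa})^n}\frac1N\mathrm{Tr}_N\big(p(r_R(A_1),\dots,r_R(A_n))\big)\,d\lambda_N(Q)(A_1,\dots,A_n)\Big| \le \|p\|_R\Big(1-\lambda_N(Q)\big((M_{N,R}^{sa})^n\big)\Big) \longrightarrow 0 \quad\text{as } N\to\infty,$$

the desired assertion follows from the naturally expected fact that

$$\begin{aligned}
\tau_Q(p) &= \lim_{N\to\infty}\int_{(M_{N,R}^{sa})^n}\frac1N\mathrm{Tr}_N\big(p(A_1,\dots,A_n)\big)\,d\tilde\lambda_{N,R}(Q)(A_1,\dots,A_n) \\
&= \lim_{N\to\infty}\int_{(M_{N,R}^{sa})^n}\frac1N\mathrm{Tr}_N\big(p(A_1,\dots,A_n)\big)\,d\lambda_N(Q)(A_1,\dots,A_n),
\end{aligned}$$

where

$$\tilde\lambda_{N,R}(Q) := \frac{1}{\lambda_N(Q)\big((M_{N,R}^{sa})^n\big)}\,\lambda_N(Q)\big|_{(M_{N,R}^{sa})^n} = \bigotimes_{i=1}^n\tilde\lambda_{N,R}(Q_i), \qquad \tilde\lambda_{N,R}(Q_i) := \frac{1}{\lambda_N(Q_i)(M_{N,R}^{sa})}\,\lambda_N(Q_i)\big|_{M_{N,R}^{sa}}.$$

Indeed, this is a simple consequence of an asymptotic freeness result in [9, 4.3.5], slightly generalizing Voiculescu's original in [22] to the setup in the almost sure sense as well as to general unitarily invariant self-adjoint random matrices. Also, one should note that $\tilde\lambda_{N,R}(Q_i)$ still weakly converges to $\mu_{Q_i}$ thanks to (2.2). □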

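We are now in a position to state the main result of this section.

Theorem 2.2. Assume that there exists a constant $\rho > 0$ such that all $Q_i(x) - \frac{\rho}{2}x^2$, $1 \le i \le n$, are convex on $\mathbb{R}$ (so that the condition (2.1) automatically holds). Assume $R \ge R_0$ with $R_0$ given above. Then we have

$$W_2(\tau,\tau_Q) \le \sqrt{\frac{2}{\rho}\Big(-\chi(\tau) + \tau\Big(\sum_{i=1}^n Q_i(X_i)\Big) + B_Q\Big)}$$

for every $\tau \in TS(\mathcal{A}_R^{(n)})$, where

$$B_Q := \lim_{N\to\infty}\Big(\frac{1}{N^2}\sum_{i=1}^n\log Z_N(Q_i) + \frac{n}{2}\log N\Big). \tag{2.3}$$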
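Since the equilibrium measure with respect to $Q(x) = x^2/2$ is the $(0,1)$-semicircular distribution $\gamma_{0,2}$ and

$$\lim_{N\to\infty}\Big(\frac{1}{N^2}\log Z_N(Q) + \frac12\log N\Big) = \frac12\log 2\pi$$

(see e.g. [9, 4.4.6 and pp. 185-186]), the above theorem includes the free analogue of Talagrand's TCI as follows.

Corollary 2.3. If $R \ge 2$ and $\gamma_{0,2}^{\star n} \in TS(\mathcal{A}_R^{(n)})$ is the (non-commutative) distribution of a standard semicircular system, then

$$W_2(\tau,\gamma_{0,2}^{\star n}) \le \sqrt{2\Big(-\chi(\tau) + \tau\Big(\frac12\sum_{i=1}^n X_i^2\Big) + \frac{n}{2}\log 2\pi\Big)}$$

for every $\tau \in TS(\mathcal{A}_R^{(n)})$.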
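As a consistency check, the bound of Theorem 2.2 with $Q_i(x) = x^2/2$ and $\rho = 1$ should vanish at $\tau = \gamma_{0,2}^{\star n}$, since the left-hand side is then zero. The computation below assumes two standard facts that are not derived in this section: the Gaussian normalization $Z_N(Q) = (2\pi/N)^{N^2/2}$ for the Euclidean structure given by the Hilbert-Schmidt norm, and the value $\chi = \frac n2\log(2\pi e)$ of the free entropy of a standard semicircular system.

```latex
\begin{align*}
B_Q &= \lim_{N\to\infty}\Big(\frac{n}{N^2}\cdot\frac{N^2}{2}\log\frac{2\pi}{N} + \frac n2\log N\Big)
     = \frac n2\log 2\pi, \\
\gamma_{0,2}^{\star n}\Big(\frac12\sum_{i=1}^n X_i^2\Big) &= \frac n2, \qquad
\chi(\gamma_{0,2}^{\star n}) = \frac n2\log(2\pi e) = \frac n2\log 2\pi + \frac n2, \\
-\chi(\gamma_{0,2}^{\star n}) + \frac n2 + B_Q
    &= -\frac n2\log 2\pi - \frac n2 + \frac n2 + \frac n2\log 2\pi = 0 .
\end{align*}
```

So the inequality is attained with equality at the reference distribution, as one expects from a transportation cost inequality.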

To prove Theorem 2.2 we need the following:

Lemma 2.4. Assume the same assumption for the $Q_i$'s with a constant $\rho > 0$. Then, for every $N \in \mathbb{N}$ and every $\lambda \in \mathcal{P}((M_N^{sa})^n)$, we have

$$W_2(\lambda,\lambda_N(Q)) \le \sqrt{\frac{2}{\rho N}\,S(\lambda,\lambda_N(Q))},$$

where $S(\lambda,\lambda_N(Q))$ is the relative entropy of $\lambda$ with respect to $\lambda_N(Q)$.

Proof. Since all $Q_i(x) - \frac{\rho}{2}x^2$ are convex on $\mathbb{R}$, so is

$$(A_1,\dots,A_n) \in (M_N^{sa})^n \mapsto N\sum_{i=1}^n\mathrm{Tr}_N\Big(Q_i(A_i) - \frac{\rho}{2}A_i^2\Big).$$

(This is the reason why the multiple constant $1/N$ appears, see [12, p. 212].) Hence, the TCI for measures on Euclidean spaces (see [16, Theorem 6.5]), slightly generalizing Talagrand's original, implies the desired inequality by regarding $(M_N^{sa})^n$ as $\mathbb{R}^{N^2 n}$. □

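Proof of Theorem 2.2. First, note that the existence of the limit in (2.3) is in [9, 5.4.3]. When $\chi(\tau) = -\infty$ nothing has to be done, so let us assume $\chi(\tau) > -\infty$. Recall that

$$\begin{aligned}
\chi(\tau) &= \chi_R(\pi_\tau(X_1),\dots,\pi_\tau(X_n)) \\
&= \lim_{\substack{m\to\infty\\ \varepsilon\searrow0}}\limsup_{N\to\infty}\Big(\frac{1}{N^2}\log\Lambda_N^{\otimes n}\big(\Gamma_R(\pi_\tau(X_1),\dots,\pi_\tau(X_n);N,m,\varepsilon)\big) + \frac n2\log N\Big),
\end{aligned}$$

where $\Gamma_R(\pi_\tau(X_1),\dots,\pi_\tau(X_n);N,m,\varepsilon)$ is the set of all $n$-tuples $(A_1,\dots,A_n) \in (M_{N,R}^{sa})^n$ such that

$$\Big|\frac1N\mathrm{Tr}_N(A_{i_1}\cdots A_{i_r}) - \tau(X_{i_1}\cdots X_{i_r})\Big| = \Big|\frac1N\mathrm{Tr}_N(A_{i_1}\cdots A_{i_r}) - \tilde\tau\big(\pi_\tau(X_{i_1})\cdots\pi_\tau(X_{i_r})\big)\Big| < \varepsilon$$

for all possible $i_1,\dots,i_r$ with $1 \le r \le m$. A suitable subsequence $N(1) < N(2) < \cdots$ can be chosen in such a way that letting

$$\Gamma_R(\tau;k) := \Gamma_R(\pi_\tau(X_1),\dots,\pi_\tau(X_n);N(k),k,1/k)$$

we get

$$\chi(\tau) = \lim_{k\to\infty}\Big(\frac{1}{N(k)^2}\log\Lambda_{N(k)}^{\otimes n}\big(\Gamma_R(\tau;k)\big) + \frac n2\log N(k)\Big). \tag{2.4}$$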
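Looking at this, we introduce the random matrix distribution $\lambda_{N(k),R} \in TS(\mathcal{A}_R^{(n)})$ associated with the probability measure

$$\lambda_{N(k)} := \frac{1}{\Lambda_{N(k)}^{\otimes n}(\Gamma_R(\tau;k))}\,\Lambda_{N(k)}^{\otimes n}\big|_{\Gamma_R(\tau;k)}.$$

Let $h$ be an arbitrary monomial $X_{i_1}\cdots X_{i_r} \in \mathcal{A}_R^{(n)}$. As long as $r \le k$, we get

$$\Big|\frac{1}{N(k)}\mathrm{Tr}_{N(k)}\big(h(A_1,\dots,A_n)\big) - \tau(h)\Big| = \Big|\frac{1}{N(k)}\mathrm{Tr}_{N(k)}(A_{i_1}\cdots A_{i_r}) - \tau(X_{i_1}\cdots X_{i_r})\Big| < \frac1k$$

for all $(A_1,\dots,A_n) \in \Gamma_R(\tau;k)$, and hence

$$|\lambda_{N(k),R}(h) - \tau(h)| \le \int_{\Gamma_R(\tau;k)}\Big|\frac{1}{N(k)}\mathrm{Tr}_{N(k)}\big(h(A_1,\dots,A_n)\big) - \tau(h)\Big|\,d\lambda_{N(k)}(A_1,\dots,A_n) < \frac1k.$$

This shows that $\lambda_{N(k),R}(h) \to \tau(h)$ as $k\to\infty$ for all monomials $h$, so that we get

$$\lim_{k\to\infty}\lambda_{N(k),R} = \tau \quad\text{weakly*}. \tag{2.5}$$

By Lemmas 1.3 and 2.4 we have

$$\begin{aligned}
W_2\big(\lambda_{N(k),R},\lambda_{N(k),R}(Q)\big)^2
&\le \frac{1}{N(k)}W_2\big(\lambda_{N(k)},\lambda_{N(k)}(Q)\big)^2 \\
&\le \frac{2}{\rho N(k)^2}\,S\big(\lambda_{N(k)},\lambda_{N(k)}(Q)\big) \\
&= \frac{2}{\rho N(k)^2}\int\log\frac{d\lambda_{N(k)}}{d\lambda_{N(k)}(Q)}(A_1,\dots,A_n)\,d\lambda_{N(k)}(A_1,\dots,A_n) \\
&= \frac{2}{\rho N(k)^2}\int\Big(-\log\Lambda_{N(k)}^{\otimes n}\big(\Gamma_R(\tau;k)\big) + N(k)\sum_{i=1}^n\mathrm{Tr}_{N(k)}(Q_i(A_i)) + \sum_{i=1}^n\log Z_{N(k)}(Q_i)\Big)\,d\lambda_{N(k)}(A_1,\dots,A_n) \\
&= \frac{2}{\rho}\Big\{-\Big(\frac{1}{N(k)^2}\log\Lambda_{N(k)}^{\otimes n}\big(\Gamma_R(\tau;k)\big) + \frac n2\log N(k)\Big) + \lambda_{N(k),R}\Big(\sum_{i=1}^n Q_i(X_i)\Big) \\
&\qquad\quad + \Big(\frac{1}{N(k)^2}\sum_{i=1}^n\log Z_{N(k)}(Q_i) + \frac n2\log N(k)\Big)\Big\}.
\end{aligned}$$

The last expression above converges to

$$\frac{2}{\rho}\Big(-\chi(\tau) + \tau\Big(\sum_{i=1}^n Q_i(X_i)\Big) + B_Q\Big)$$

as $k\to\infty$ thanks to (2.4) and (2.5). On the other hand, by the joint lower semi-continuity of $W_2$ (see 2° in §§1.3), Lemma 2.1 and (2.5), we have

$$W_2(\tau,\tau_Q) \le \liminf_{k\to\infty}W_2\big(\lambda_{N(k),R},\lambda_{N(k),R}(Q)\big),$$

completing the proof. □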

3. Free TCI for $\eta$

We first recall the free pressure $\pi_R$ and the free entropy-like quantity $\eta_R$ (the Legendre transform of $\pi_R$) introduced in [7]. For $R > 0$ fixed let $(\mathcal{A}_R^{(n)})^{sa}$ and $(M_{N,R}^{sa})^n$ be as in §1. For each $h \in (\mathcal{A}_R^{(n)})^{sa}$ the free pressure $\pi_R(h)$ of $h$ is defined by

$$\pi_R(h) := \limsup_{N\to\infty}\Big(\frac{1}{N^2}P_{N,R}(h) + \frac n2\log N\Big),$$

where the (microstates) pressure function $P_{N,R}(h)$ is given as

$$P_{N,R}(h) := \log\int_{(M_{N,R}^{sa})^n}\exp\big(-N\mathrm{Tr}_N(h(A_1,\dots,A_n))\big)\,d\Lambda_N^{\otimes n}(A_1,\dots,A_n).$$

Note that $\pi_R$ is a convex function on $(\mathcal{A}_R^{(n)})^{sa}$ such that $|\pi_R(h_1) - \pi_R(h_2)| \le \|h_1 - h_2\|_R$ for all $h_1,h_2 \in (\mathcal{A}_R^{(n)})^{sa}$. For $\tau \in TS(\mathcal{A}_R^{(n)})$ the quantity $\eta_R(\tau)$ is defined by

$$\eta_R(\tau) := \inf\{\tau(h) + \pi_R(h) : h \in (\mathcal{A}_R^{(n)})^{sa}\}.$$

We then have

$$\pi_R(h) = \max\{-\tau(h) + \eta_R(\tau) : \tau \in TS(\mathcal{A}_R^{(n)})\}$$

so that $\pi_R$ on $(\mathcal{A}_R^{(n)})^{sa}$ and $\eta_R$ on $TS(\mathcal{A}_R^{(n)})$ are the Legendre transforms of each other with respect to the Banach space duality between $(\mathcal{A}_R^{(n)})^{sa}$ and $(\mathcal{A}_R^{(n)})^{*,sa}$ ($\supset TS(\mathcal{A}_R^{(n)})$). We say that $\tau \in TS(\mathcal{A}_R^{(n)})$ is an equilibrium tracial state associated with $h \in (\mathcal{A}_R^{(n)})^{sa}$ if the equality

$$\pi_R(h) = -\tau(h) + \eta_R(\tau) \tag{3.1}$$

holds. This equality is a kind of variational principle.
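The Legendre duality and the variational equality above have a familiar finite-dimensional analogue, which may help to fix ideas. On a finite state space with "Hamiltonian" $h = (h_1,\dots,h_d)$, the pressure $\log\sum_i e^{-h_i}$ equals the maximum of $-p(h) + S(p)$ over probability vectors $p$, attained exactly at the Gibbs vector. The sketch below illustrates only this classical toy model; it is not the free pressure $\pi_R$ itself, and all names are hypothetical.

```python
import math

def log_partition(h):
    """Finite-dimensional 'pressure' log(sum_i exp(-h_i)), computed stably."""
    m = min(h)
    return -m + math.log(sum(math.exp(-(x - m)) for x in h))

def gibbs(h):
    """Gibbs probability vector p_i = exp(-h_i) / sum_j exp(-h_j)."""
    z = log_partition(h)
    return [math.exp(-x - z) for x in h]

def entropy(p):
    """Shannon (Boltzmann-Gibbs) entropy -sum_i p_i log p_i."""
    return -sum(q * math.log(q) for q in p if q > 0.0)

def functional(p, h):
    """-p(h) + S(p); its maximum over probability vectors is log_partition(h)."""
    return -sum(q * x for q, x in zip(p, h)) + entropy(p)

if __name__ == "__main__":
    h = [0.3, 1.2, -0.7, 2.0]
    p = gibbs(h)
    print("pressure:          ", log_partition(h))
    print("value at Gibbs p:  ", functional(p, h))
    print("value at uniform p:", functional([0.25] * 4, h))
```

In this analogy the Gibbs vector plays the role of the equilibrium tracial state, and its uniqueness corresponds to differentiability of the pressure.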

In this section we will prove the next TCI for non-commutative tracial distributions with $\eta_R$ in place of $\chi$. Since $\chi(\tau) \le \eta_R(\tau)$ [7, Theorem 4.5], this TCI is sharper than that given in Theorem 2.2, though $\tau$ becomes restrictive here. But it is worth noting (see also [14, V.1.1]) that the set of $\tau \in TS(\mathcal{A}_R^{(n)})$ satisfying the assumption in the theorem is norm-dense in $\{\tau \in TS(\mathcal{A}_R^{(n)}) : \eta_R(\tau) > -\infty\}$.

Theorem 3.1. Let $Q = (Q_1,\dots,Q_n)$ be an $n$-tuple of real-valued continuous functions on $\mathbb{R}$, and assume that there exists a constant $\rho > 0$ such that all $Q_i(x) - \frac{\rho}{2}x^2$, $1 \le i \le n$, are convex on $\mathbb{R}$. Assume $R \ge R_0$ with $R_0$ given in §2. If $\tau \in TS(\mathcal{A}_R^{(n)})$ is an equilibrium tracial state associated with some $h \in (\mathcal{A}_R^{(n)})^{sa}$, then

$$W_2(\tau,\tau_Q) \le \sqrt{\frac{2}{\rho}\Big(-\eta_R(\tau) + \tau\Big(\sum_{i=1}^n Q_i(X_i)\Big) + B_Q\Big)}, \tag{3.2}$$

where $B_Q$ is the constant in (2.3).
The essence of the proof is the same as that of Theorem 2.2, based on the random matrix approximation procedure. For each $N \in \mathbb{N}$ and $h \in (\mathcal{A}_R^{(n)})^{sa}$ define $\lambda_{N,R}(h) \in \mathcal{P}((M_{N,R}^{sa})^n)$ by

$$d\lambda_{N,R}(h)(A_1,\dots,A_n) := \frac{1}{Z_{N,R}(h)}\exp\big(-N\mathrm{Tr}_N(h(A_1,\dots,A_n))\big)\,\chi_{(M_{N,R}^{sa})^n}(A_1,\dots,A_n)\,d\Lambda_N^{\otimes n}(A_1,\dots,A_n)$$

with the normalization constant $Z_{N,R}(h) := \exp(P_{N,R}(h))$. This $\lambda_{N,R}(h)$ is a unique probability measure on $(M_{N,R}^{sa})^n$ satisfying the (microstates) Gibbs variational principle

$$P_{N,R}(h) = -N^2\lambda_{N,R}(h)(h) + S(\lambda_{N,R}(h)) \tag{3.3}$$

with the Boltzmann-Gibbs entropy

$$S(\lambda_{N,R}(h)) := -\int_{(M_{N,R}^{sa})^n}\frac{d\lambda_{N,R}(h)}{d\Lambda_N^{\otimes n}}\log\frac{d\lambda_{N,R}(h)}{d\Lambda_N^{\otimes n}}\,d\Lambda_N^{\otimes n}.$$



Proof of Theorem 3.1. We may prove that

$$W_2(\tau_0,\tau_Q)^2 \le \frac{2}{\rho}\Big(-\pi_R(h_0) + \tau_0\Big(\sum_{i=1}^n Q_i(X_i) - h_0\Big) + B_Q\Big) \tag{3.4}$$

when $\tau_0 \in TS(\mathcal{A}_R^{(n)})$ and $h_0 \in (\mathcal{A}_R^{(n)})^{sa}$ satisfy the variational equality (3.1). Let us first assume that $\tau_0$ is a unique equilibrium tracial state associated with $h_0$ (equivalently, $\pi_R$ is differentiable at $h_0$). Choose a subsequence $N(1) < N(2) < \cdots$ such that

$$\pi_R(h_0) = \lim_{k\to\infty}\Big(\frac{1}{N(k)^2}P_{N(k),R}(h_0) + \frac n2\log N(k)\Big) \tag{3.5}$$

and $\lambda_{N(k),R}(h_0)$ weakly* converges to some $\tau_1 \in TS(\mathcal{A}_R^{(n)})$. For every $h \in (\mathcal{A}_R^{(n)})^{sa}$ we get

$$\lambda_{N(k),R}(h_0)(h) + \frac{1}{N(k)^2}P_{N(k),R}(h) \ge \frac{1}{N(k)^2}S\big(\lambda_{N(k),R}(h_0)\big) = \lambda_{N(k),R}(h_0)(h_0) + \frac{1}{N(k)^2}P_{N(k),R}(h_0)$$

thanks to (3.3). From this and (3.5) as well as the weak* convergence of $\lambda_{N(k),R}(h_0)$ it is easy to see that

$$\tau_1(h) + \pi_R(h) \ge \tau_1(h_0) + \pi_R(h_0)$$

so that $\tau_1$ is an equilibrium tracial state associated with $h_0$, and hence $\tau_1 = \tau_0$ by the uniqueness assumption. Therefore,

$$\lambda_{N(k),R}(h_0) \longrightarrow \tau_0 \quad\text{weakly* as } k\to\infty. \tag{3.6}$$
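For every $N \in \mathbb{N}$ let $\lambda_N(Q)$, $Z_N(Q)$ and $\lambda_{N,R}(Q)$ be defined as in §2. By Lemmas 1.3 and 2.4 as in the proof of Theorem 2.2 we have

$$\begin{aligned}
\frac12 W_2\big(\lambda_{N,R}(h_0),\lambda_{N,R}(Q)\big)^2
&\le \frac{1}{\rho N^2}\int_{(M_{N,R}^{sa})^n}\log\frac{d\lambda_{N,R}(h_0)}{d\lambda_N(Q)}\,d\lambda_{N,R}(h_0) \\
&= \frac{1}{\rho}\Big(-\frac{1}{N^2}P_{N,R}(h_0) + \lambda_{N,R}(h_0)\Big(\sum_{i=1}^n Q_i(X_i) - h_0\Big) + \frac{1}{N^2}\sum_{i=1}^n\log Z_N(Q_i)\Big)
\end{aligned}$$

thanks to (3.1). Now, restrict the above estimates to the subsequence $N(1) < N(2) < \cdots$ and apply (3.5) as well as (2.3). By 2° in §§1.3 together with Lemma 2.1 and (3.6), we then obtain

$$\begin{aligned}
\frac12 W_2(\tau_0,\tau_Q)^2 &\le \liminf_{k\to\infty}\frac12 W_2\big(\lambda_{N(k),R}(h_0),\lambda_{N(k),R}(Q)\big)^2 \\
&\le \frac{1}{\rho}\Big(-\pi_R(h_0) + \tau_0\Big(\sum_{i=1}^n Q_i(X_i) - h_0\Big) + B_Q\Big).
\end{aligned}$$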

Next, assume that $\tau_0$ is a not necessarily unique equilibrium tracial state associated with $h_0$. According to _f5| (also 6.2.43]), $\tau_0$ belongs to the weakly* closed convex hull of the set $\mathcal{T}_0$ of $\tau \in TS(\mathcal{A}_R^{(n)})$ for which there exist $h_k \in (\mathcal{A}_R^{(n)})^{sa}$ and $\tau_k \in TS(\mathcal{A}_R^{(n)})$ such that $\tau_k$ is a unique equilibrium tracial state associated with $h_k$ for each $k \in \mathbb{N}$, $\|h_k - h_0\|_R \to 0$ and $\tau_k \to \tau$ weakly*. To show (3.4) for $\tau_0$ and $h_0$, it suffices thanks to 2° and 3° in §§1.3 to prove it for every $\tau \in \mathcal{T}_0$ and $h_0$. Let $h_k$ and $\tau_k$ be as in the description of the set $\mathcal{T}_0$. Then, the above-proven case implies that

$$W_2(\tau_k,\tau_Q)^2 \le \frac{2}{\rho}\Big(-\pi_R(h_k) + \tau_k\Big(\sum_{i=1}^n Q_i(X_i) - h_k\Big) + B_Q\Big)$$

for all $k \in \mathbb{N}$. Hence (3.4) for $\tau$ and $h_0$ is obtained by letting $k\to\infty$ in view of 2° in §§1.3 and the norm-continuity of $\pi_R$; thus the proof is completed. □

Corollary 3.2. Let $Q = (Q_1,\dots,Q_n)$ be an $n$-tuple of real-valued continuous functions on $\mathbb{R}$ with the same assumption as in Theorem 3.1 for some $\rho > 0$, and let $R > 0$ be as in Theorem 3.1. Then $\tau_Q$ is a unique equilibrium tracial state associated with $\sum_{i=1}^n Q_i(X_i) \in (\mathcal{A}_R^{(n)})^{sa}$.

Proof. If $\tau_0 \in TS(\mathcal{A}_R^{(n)})$ is an equilibrium tracial state associated with $\sum_{i=1}^n Q_i(X_i)$, then the right-hand side of (3.2) (or (3.4)) is zero so that $\tau_0 = \tau_Q$. □

In particular, let $Q_i(x) = x^2/2$, $1 \le i \le n$, and $R \ge 2$. If $\tau \in TS(\mathcal{A}_R^{(n)})$ is an equilibrium tracial state associated with some $h \in (\mathcal{A}_R^{(n)})^{sa}$, then

$$W_2(\tau,\gamma_{0,2}^{\star n}) \le \sqrt{2\Big(-\eta_R(\tau) + \tau\Big(\frac12\sum_{i=1}^n X_i^2\Big) + \frac n2\log 2\pi\Big)},$$

where $\gamma_{0,2}^{\star n}$ denotes the distribution of a standard semicircular system. Hence, $\gamma_{0,2}^{\star n}$ is a unique equilibrium tracial state associated with $\frac12\sum_{i=1}^n X_i^2$. This also says that $\eta_R$ admits a unique maximizer $\gamma_{0,2}^{\star n}$ when restricted on $\{\tau \in TS(\mathcal{A}_R^{(n)}) : \tau(\sum_{i=1}^n X_i^2) \le n\}$, which is a refinement of the same result for $\chi$ in EH-

The question whether the TCI (3.2) holds or not for any $\tau \in TS(\mathcal{A}_R^{(n)})$ without the equilibrium assumption is still left open (and seems very important to obtain an in-depth understanding of $\eta_R$).

4. The unitary case

4.1. TCI for $\chi_u$. For each $n \in \mathbb{N}$, the universal free product C*-algebra $C(\mathbb{T})^{\star n}$, where $\mathbb{T}$ is the unit circle, is nothing but the universal group C*-algebra $C^*(\mathbb{F}_n)$ of the free group $\mathbb{F}_n$ of $n$ generators. Let $g_1,\dots,g_n$ denote the canonical $n$ unitary generators of $C^*(\mathbb{F}_n)$. For each $\tau \in TS(C^*(\mathbb{F}_n))$ take the W*-probability space $(\pi_\tau(C^*(\mathbb{F}_n))'',\tilde\tau)$ via the GNS representation $\pi_\tau$ and define the free entropy (unitary version) $\chi_u(\tau)$ by

$$\chi_u(\tau) := \chi_u(\pi_\tau(g_1),\dots,\pi_\tau(g_n))$$

(see [9, §6.5] for the precise definition of the microstates free entropy for $n$-tuples of unitaries).

On the other hand, the $p$-Wasserstein distance $W_p(\tau_1,\tau_2)$ between $\tau_1,\tau_2 \in TS(C^*(\mathbb{F}_n))$ is defined by (1.2) with $(g_1,\dots,g_n)$ in place of $(a_1,\dots,a_n)$. (Note that $a_1,\dots,a_n$ were not necessarily self-adjoint in §§1.3.)

For each real-valued continuous function $Q$ on $\mathbb{T}$, the functional

$$-\Sigma(\mu) + \mu(Q) := -\iint_{\mathbb{T}^2}\log|\zeta-\eta|\,d\mu(\zeta)\,d\mu(\eta) + \int_{\mathbb{T}}Q(\zeta)\,d\mu(\zeta), \quad \mu \in \mathcal{P}(\mathbb{T}),$$

has a unique minimizer $\mu_Q$ called the equilibrium measure associated with $Q$ (see ^2]). When $Q = (Q_1,\dots,Q_n)$ is an $n$-tuple of real-valued continuous functions on $\mathbb{T}$, we define $\tau_Q \in TS(C^*(\mathbb{F}_n))$ as the free product of the $\mu_{Q_i}$'s, i.e., $\tau_Q := \star_{i=1}^n\mu_{Q_i}$ on $C^*(\mathbb{F}_n) = C(\mathbb{T})^{\star n}$.

The next theorem is the counterpart of Theorem 2.2 in the unitary setting.

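Theorem 4.1. Assume that there exists a constant $\rho > -\frac12$ such that all $Q_i(e^{\sqrt{-1}\,t}) - \frac{\rho}{2}t^2$, $1 \le i \le n$, are convex on $\mathbb{R}$. Then we have

$$W_2(\tau,\tau_Q) \le \sqrt{\frac{4}{1+2\rho}\Big(-\chi_u(\tau) + \tau\Big(\sum_{i=1}^n Q_i(g_i)\Big) + B_Q\Big)}$$

for every $\tau \in TS(C^*(\mathbb{F}_n))$, where

$$B_Q := \chi_u(\tau_Q) - \tau_Q\Big(\sum_{i=1}^n Q_i(g_i)\Big).$$

(See also 3° below for the constant $B_Q$.) Furthermore, $\tau_Q$ is a unique minimizer of $-\chi_u(\tau) + \tau(\sum_{i=1}^n Q_i(g_i))$ for $\tau \in TS(C^*(\mathbb{F}_n))$.

In the special case where the $Q_i$'s are all zero and so $\rho = 0$, the above inequality becomes

$$W_2(\tau,\gamma_0^{\star n}) \le 2\sqrt{-\chi_u(\tau)}, \quad \tau \in TS(C^*(\mathbb{F}_n)),$$

where the free product state $\gamma_0^{\star n}$ is the distribution of a standard Haar unitary system of $n$ variables.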

A key idea in proving the theorem is to apply the classical TCI in the Riemannian setting in a certain random matrix approximation. Here, for a geometric reason concerning Ricci curvature tensors, the random matrices at our disposal are special unitary ones instead of unitary ones. Some important facts needed in the proof are in order.
1° The SU-microstates free entropy. Even when $\mathrm{U}(N)$ is replaced by $\mathrm{SU}(N)$ in the definition of $\chi_u(u_1,\dots,u_n)$ [9, §6.5], the microstates free entropy introduced is the same. To prove this, define $\xi : \mathbb{T}^N \to \{(\zeta_1,\dots,\zeta_N) \in \mathbb{T}^N : \zeta_1\cdots\zeta_N = 1\}$ by

$$\xi(\zeta_1,\dots,\zeta_N) := \big(\zeta_1(\zeta_1\cdots\zeta_N)^{-1/N},\dots,\zeta_N(\zeta_1\cdots\zeta_N)^{-1/N}\big),$$

where $\zeta^{1/N}$ for $\zeta \in \mathbb{T}$ is the principal $N$th root and $\zeta^{-1/N} := (\zeta^{1/N})^{-1}$, and define $\Xi : \mathrm{U}(N) \to \mathrm{SU}(N)$ by $\Xi(U) := V\,\mathrm{diag}\,\xi(\zeta_1,\dots,\zeta_N)\,V^*$ under a diagonalization $U = V\,\mathrm{diag}(\zeta_1,\dots,\zeta_N)\,V^*$ with $V \in \mathrm{U}(N)$ and $(\zeta_1,\dots,\zeta_N) \in \mathbb{T}^N$. Then, $\Xi$ is a well-defined Borel measurable map and we have $\gamma_N^{\mathrm{U}}\circ\Xi^{-1} = \gamma_N^{\mathrm{SU}}$ (see §§1.1 for notations). Now, the above-mentioned fact can be directly shown by using the forms of $\gamma_N^{\mathrm{U}}$ and $\gamma_N^{\mathrm{SU}}$ under diagonalizations (see e.g. [12, §§1.5]).
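The map from $\mathrm{U}(N)$ to $\mathrm{SU}(N)$ described in 1° can be sketched numerically as follows. Since every eigenvalue is multiplied by the same scalar $(\zeta_1\cdots\zeta_N)^{-1/N}$, the result should coincide with $(\det U)^{-1/N}U$ for the principal root; the code checks this identity, which is an observation added here and not stated in the text. The sampler `haar_unitary` (QR of a complex Gaussian matrix with phase correction) and the function names are hypothetical, and `numpy.linalg.inv` is used in place of $V^*$ because the eigenvectors returned by `numpy.linalg.eig` are not guaranteed to be orthonormal.

```python
import numpy as np

def haar_unitary(n, rng):
    """Random unitary via QR of a complex Gaussian matrix, with column phases fixed."""
    z = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2.0)
    q, r = np.linalg.qr(z)
    d = np.diag(r)
    return q * (d / np.abs(d))

def xi_map(u):
    """Map U in U(N) to SU(N) by rescaling all eigenvalues by (zeta_1...zeta_N)^(-1/N)."""
    n = u.shape[0]
    zeta, v = np.linalg.eig(u)
    # Inverse of the principal N-th root of the product of the eigenvalues.
    root = np.exp(-1j * np.angle(np.prod(zeta)) / n)
    return (v * (zeta * root)) @ np.linalg.inv(v)

def xi_map_scalar(u):
    """Same map written as (det U)^(-1/N) * U with the principal root."""
    n = u.shape[0]
    return np.exp(-1j * np.angle(np.linalg.det(u)) / n) * u

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    u = haar_unitary(4, rng)
    s = xi_map(u)
    print("det:", np.linalg.det(s))
    print("agrees with scalar form:", np.allclose(s, xi_map_scalar(u)))
```

The scalar form also makes it plausible that the map is well defined, that is, independent of the chosen diagonalization.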

2° A key inequality. For each $\lambda \in \mathcal{P}(\mathrm{SU}(N)^n)$ define the distribution $\hat\lambda \in TS(C^*(\mathbb{F}_n))$ by

$$\hat\lambda(h) := \int_{\mathrm{SU}(N)^n}\frac1N\mathrm{Tr}_N\big(h(U_1,\dots,U_n)\big)\,d\lambda(U_1,\dots,U_n), \quad h \in C^*(\mathbb{F}_n),$$

where $h \in C^*(\mathbb{F}_n) \mapsto h(U_1,\dots,U_n) \in M_N(\mathbb{C})$ is the *-homomorphism ("non-commutative functional calculus") sending each $g_i$ to $U_i$ for each $(U_1,\dots,U_n) \in \mathrm{SU}(N)^n$. For every $\lambda_1,\lambda_2 \in \mathcal{P}(\mathrm{SU}(N)^n)$ we have

$$W_2(\hat\lambda_1,\hat\lambda_2) \le \frac{1}{\sqrt N}W_{2,HS}(\lambda_1,\lambda_2) \le \frac{1}{\sqrt N}W_{2,\mathrm{geod}}(\lambda_1,\lambda_2),$$

where $W_{2,HS}(\lambda_1,\lambda_2)$ is the Wasserstein distance with respect to the distance on $\mathrm{SU}(N)^n$ induced by the Hilbert-Schmidt norm, while $W_{2,\mathrm{geod}}(\lambda_1,\lambda_2)$ is that with respect to the geodesic distance. The proof of the first inequality is similar to that of Lemma 1.3, while the second is obvious.

3° Asymptotic freeness for SU-random matrices. Let $Q = (Q_1, \dots, Q_n)$ be real-valued continuous functions on $\mathbb{T}$. For each $N \in \mathbb{N}$ define $\lambda_N(Q) \in \mathcal{P}(SU(N)^n)$ as the product measure $\lambda_N(Q) := \bigotimes_{i=1}^n \lambda_N(Q_i)$ of

\[ d\lambda_N(Q_i)(U) := \frac{1}{Z_N^{SU}(Q_i)} \exp\bigl(-N \mathrm{Tr}_N(Q_i(U))\bigr)\, d\gamma_N^{SU}(U) \]

with a normalization constant $Z_N^{SU}(Q_i)$. The asymptotic freeness for unitary random matrices due to [22] remains valid for special unitary random matrices. In fact, a stronger result on the almost sure asymptotic freeness for independent special unitary random matrices can be shown by modifying the proof in [SJ 4.3.5]. Also, as a consequence of the large deviation theorem [13, Theorem 2.1], it follows that the mean eigenvalue distribution of $\lambda_N(Q_i)$ converges to $\mu_{Q_i}$ for $1 \le i \le n$. We thus see that

\[ \hat\lambda_N(Q) \longrightarrow \tau_Q = \mathop{\bigstar}_{i=1}^n \mu_{Q_i} \quad \text{weakly*}. \]

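Moreover, since

[displayed formula missing in this copy]

(see [13, Theorem 2.1]), we notice that the constant $B_Q$ in Theorem 4.1 can be expressed as

[displayed formula not recoverable in this copy; only the limits $i = 1, \dots, n$ of two sums survive]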
4° TCI on $SU(N)^n$. Let $Q = (Q_1, \dots, Q_n)$ be as in Theorem 4.1 and assume further that all the $Q_i$'s are $C^2$-functions. Then, thanks to [12, Lemmas 1.2 and 1.3], the function

\[ \Psi_N(U_1, \dots, U_n) := N \mathrm{Tr}_N\Bigl(\sum_{i=1}^n Q_i(U_i)\Bigr) \]

is a $C^2$-function on $SU(N)^n$ with the Hessian $\mathrm{Hess}(\Psi_N) \ge N\rho\, I_{(N^2-1)n}$. Also, note that the Ricci curvature tensor of $SU(N)^n$ is $\mathrm{Ric}(SU(N)^n) = \frac{N}{2} I_{(N^2-1)n}$. Hence, by the TCI in the Riemannian manifold setting due to [18] combined with 0, we obtain

\[ W_{2,\mathrm{geod}}(\lambda, \lambda_N(Q)) \le \sqrt{\frac{4}{N(1+2\rho)}\, S(\lambda, \lambda_N(Q))} \]

for every $\lambda \in \mathcal{P}(SU(N)^n)$.
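The constant in the last display comes from adding the two curvature-type lower bounds. The computation below is a sketch under assumptions that are not fully legible in this copy, namely $\mathrm{Ric}(SU(N)^n) = \frac{N}{2} I$ and $\mathrm{Hess}(\Psi_N) \ge N\rho\, I$; it also uses the Riemannian TCI in the familiar form $W_2 \le \sqrt{2S/K}$ for a reference measure proportional to $e^{-\Psi}\, d\mathrm{vol}$ with $\mathrm{Ric} + \mathrm{Hess}(\Psi) \ge K > 0$.

```latex
% K_N is a name introduced here for the combined lower bound.
\begin{align*}
\mathrm{Ric}(SU(N)^n) + \mathrm{Hess}(\Psi_N)
  &\ge \Bigl(\frac{N}{2} + N\rho\Bigr) I_{(N^2-1)n}
   = K_N\, I_{(N^2-1)n},
  \qquad K_N := \frac{N(1+2\rho)}{2}, \\
K_N > 0 &\iff \rho > -\tfrac{1}{2}, \\
W_{2,\mathrm{geod}}(\lambda, \lambda_N(Q))
  &\le \sqrt{\frac{2}{K_N}\, S(\lambda, \lambda_N(Q))}
   = \sqrt{\frac{4}{N(1+2\rho)}\, S(\lambda, \lambda_N(Q))} .
\end{align*}
```

This is also where the restriction $\rho > -\frac{1}{2}$ enters: it is exactly the condition that the combined lower bound $K_N$ is positive.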

Now, the proof of Theorem 4.1 based on the above facts 1°-4° is analogous to that of Theorem 2.2, so the details are left to the reader. It is, however, worthwhile to note one more point. As in the proof of [12, Theorem 2.7], the regularization technique using Poisson integrals enables us to assume that all the $Q_i$'s are smooth functions on $\mathbb{T}$, so that one can go through with 4°.
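The way 2° and 4° combine at finite $N$ can be sketched as follows. This is an outline only, assuming that $\lambda$ is absolutely continuous with respect to $(\gamma_N^{SU})^{\otimes n}$ and that the constant in 4° is $4/(N(1+2\rho))$; the passage to the limit $N \to \infty$ relies on the arguments that the text leaves to the reader.

```latex
\begin{align*}
S(\lambda, \lambda_N(Q))
  &= S\bigl(\lambda, (\gamma_N^{SU})^{\otimes n}\bigr)
   + N \int_{SU(N)^n} \mathrm{Tr}_N\Bigl(\sum_{i=1}^n Q_i(U_i)\Bigr)\, d\lambda
   + \sum_{i=1}^n \log Z_N^{SU}(Q_i), \\
\frac{1}{N^2} S(\lambda, \lambda_N(Q))
  &= \frac{1}{N^2} S\bigl(\lambda, (\gamma_N^{SU})^{\otimes n}\bigr)
   + \hat\lambda\Bigl(\sum_{i=1}^n Q_i(g_i)\Bigr)
   + \frac{1}{N^2} \sum_{i=1}^n \log Z_N^{SU}(Q_i), \\
W_2\bigl(\hat\lambda, \hat\lambda_N(Q)\bigr)^2
  &\le \frac{1}{N}\, W_{2,\mathrm{geod}}(\lambda, \lambda_N(Q))^2
   \le \frac{4}{1+2\rho} \cdot \frac{1}{N^2}\, S(\lambda, \lambda_N(Q)) .
\end{align*}
```

Formally, the three terms on the right of the second line are the finite-$N$ counterparts of $-\chi_u(\tau)$, $\tau\bigl(\sum_{i=1}^n Q_i(g_i)\bigr)$ and $B_Q$, respectively.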

4.2. TCI for $\eta_u$. For each $h \in C^*(\mathbb{F}_n)^{sa}$ we introduce the free pressure (unitary version) $\pi_u(h)$ by

\begin{align*}
\pi_u(h) &:= \limsup_{N \to \infty} \Bigl( \frac{1}{N^2} \log \int_{U(N)^n} \exp\bigl(-N \mathrm{Tr}_N(h(U_1, \dots, U_n))\bigr)\, d(\gamma_N^{U})^{\otimes n}(U_1, \dots, U_n) \Bigr) \\
&= \limsup_{N \to \infty} \Bigl( \frac{1}{N^2} \log \int_{SU(N)^n} \exp\bigl(-N \mathrm{Tr}_N(h(U_1, \dots, U_n))\bigr)\, d(\gamma_N^{SU})^{\otimes n}(U_1, \dots, U_n) \Bigr).
\end{align*}

The equality of the two lim sup's can be shown from the fact stated in 1° above. As in the self-adjoint setting 7, Proposition 2.3], $\pi_u$ is convex on $C^*(\mathbb{F}_n)^{sa}$ and $|\pi_u(h_1) - \pi_u(h_2)| \le \|h_1 - h_2\|$ for all $h_1, h_2 \in C^*(\mathbb{F}_n)^{sa}$. It is seen as in Theorem 3.4] (or (JXTJ)) that $\pi_u$ is the converse Legendre transform of $\eta_u$ as

\[ \pi_u(h) = \max\{ -\tau(h) + \eta_u(\tau) : \tau \in TS(C^*(\mathbb{F}_n)) \}, \qquad h \in C^*(\mathbb{F}_n)^{sa}, \]

and we say that $\tau$ is an equilibrium tracial state associated with $h$ if $\pi_u(h) = -\tau(h) + \eta_u(\tau)$ holds.
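Two elementary consequences of the variational formula may help in reading the definition. The lines below are a sketch that takes the max formula as given; they are not a substitute for the proofs cited above.

```latex
\begin{align*}
% (a) A Fenchel-type inequality: equality characterizes equilibrium states.
\eta_u(\tau) - \tau(h) &\le \pi_u(h)
  \quad \text{for all } \tau \in TS(C^*(\mathbb{F}_n)),\ h \in C^*(\mathbb{F}_n)^{sa}, \\
\eta_u(\tau) - \tau(h) &= \pi_u(h)
  \iff \tau \text{ is an equilibrium tracial state associated with } h. \\
% (b) The Lipschitz bound: use |\tau(h_1) - \tau(h_2)| \le \|h_1 - h_2\| inside the max.
-\tau(h_1) + \eta_u(\tau)
  &\le -\tau(h_2) + \eta_u(\tau) + \|h_1 - h_2\|
  \;\Longrightarrow\;
  \pi_u(h_1) \le \pi_u(h_2) + \|h_1 - h_2\| .
\end{align*}
```

Exchanging the roles of $h_1$ and $h_2$ in (b) gives the two-sided estimate $|\pi_u(h_1) - \pi_u(h_2)| \le \|h_1 - h_2\|$.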

In particular, when $n = 1$ and $\mu \in \mathcal{P}(\mathbb{T})$ $(= TS(C(\mathbb{T})))$ we have $\eta_u(\mu) = \chi_u(\mu)$ $(= \Sigma(\mu))$ (see [5J §6]). The proof of the next theorem is similar to that of Theorem 4.5].

Theorem 4.2. We have $\chi_u(\tau) \le \eta_u(\tau)$ for every $\tau \in TS(C^*(\mathbb{F}_n))$. Moreover, if $\tau$ is a free product tracial state (i.e., $g_1, \dots, g_n$ are $*$-free with respect to $\tau$), then $\chi_u(\tau) = \eta_u(\tau)$.

Furthermore, by considering the minimal $C^*$-tensor product $C^*(\mathbb{F}_n) \otimes_{\min} C^*(\mathbb{F}_n)$ as in §6], the definition of $\eta_u$ can be modified so that the modified $\eta_u(\tau)$ is equal to $\chi_u(\tau)$ for all $\tau \in TS(C^*(\mathbb{F}_n))$. This result is of considerable importance, but it is not directly related to the free TCI in Theorem 4.3, so we omit the details.

Finally, we state the counterpart of Theorem 3.1 in the unitary setting; the TCI is sharper than that in Theorem 4.1, though $\tau$ is rather restricted. The structure of the proof is quite parallel to that of Theorem 3.1, and the details are again left to the reader.
Theorem 4.3. Let $Q = (Q_1, \dots, Q_n)$ be real-valued continuous functions on $\mathbb{T}$ satisfying the same assumption as in Theorem 4.1 with a constant $\rho > -\frac{1}{2}$. If $\tau \in TS(C^*(\mathbb{F}_n))$ is an equilibrium tracial state associated with some $h \in C^*(\mathbb{F}_n)^{sa}$, then

\[ W_2(\tau, \tau_Q) \le \sqrt{\frac{4}{1+2\rho} \Bigl( -\eta_u(\tau) + \tau\Bigl(\sum_{i=1}^n Q_i(g_i)\Bigr) + B_Q \Bigr)}, \]

where $B_Q$ is the same constant as in Theorem 4.1. Furthermore, $\tau_Q$ is the unique equilibrium tracial state associated with $\sum_{i=1}^n Q_i(g_i) \in C^*(\mathbb{F}_n)^{sa}$.
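The remark that this TCI is sharper than the one in Theorem 4.1 can be made explicit with Theorem 4.2. The following lines are a sketch of that comparison; they use only $\chi_u \le \eta_u$ and the fact, noted in 3°, that $\tau_Q$ is a free product tracial state.

```latex
\begin{align*}
\chi_u(\tau) \le \eta_u(\tau)
  &\;\Longrightarrow\;
  -\eta_u(\tau) + \tau\Bigl(\sum_{i=1}^n Q_i(g_i)\Bigr) + B_Q
  \;\le\;
  -\chi_u(\tau) + \tau\Bigl(\sum_{i=1}^n Q_i(g_i)\Bigr) + B_Q, \\
\tau_Q \text{ free product}
  &\;\Longrightarrow\;
  \chi_u(\tau_Q) = \eta_u(\tau_Q)
  \;\Longrightarrow\;
  B_Q = \eta_u(\tau_Q) - \tau_Q\Bigl(\sum_{i=1}^n Q_i(g_i)\Bigr) .
\end{align*}
```

So, for the equilibrium tracial states to which it applies, the upper bound in Theorem 4.3 is never larger than the one in Theorem 4.1, and the constant $B_Q$ can be written with either entropy quantity.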